Statistical multi-stream modeling of real-time MRI articulatory speech data

Authors

Erik Bresch

Athanasios Katsamanis

Louis Goldstein

Shrikanth S. Narayanan

Abstract

This paper investigates different statistical modeling frameworks for articulatory speech data obtained using real-time (RT) magnetic resonance imaging (MRI). To quantitatively capture the spatio-temporal shaping process of the human vocal tract during speech production, a multi-dimensional stream of direct image features is extracted automatically from the MRI recordings. The features are closely related, though not identical, to the tract variables commonly defined in articulatory phonology. The modeling of the shaping process aims to decompose the articulatory data streams into primitives by segmentation. Several approaches to the segmentation task are investigated, including vector quantizers, Gaussian Mixture Models, Hidden Markov Models, and a coupled Hidden Markov Model. We evaluate the performance of the different segmentation schemes qualitatively with the help of a well-understood data set that was used in an earlier study of inter-articulatory timing phenomena of American English nasal sounds.
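As an illustration of the simplest scheme named in the abstract, the following is a minimal sketch of segmentation by vector quantization. It is not the authors' implementation: the function names, the k-means training procedure, the two-dimensional toy frames, and the codebook size are all assumptions chosen for the example. Each frame of the feature stream is mapped to its nearest codeword, and consecutive frames that share a codeword are merged into one segment.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(frame, codebook):
    """Index of the codeword closest to the given frame."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(frame, codebook[k]))


def train_codebook(frames, num_codewords, iterations=20, seed=0):
    """Plain k-means training of a vector quantizer (assumed procedure)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, num_codewords)]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for frame in frames:
            clusters[nearest_codeword(frame, codebook)].append(frame)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                codebook[k] = [sum(dim) / len(members)
                               for dim in zip(*members)]
    return codebook


def segment(frames, codebook):
    """Merge runs of frames with the same codeword into
    (first_frame, last_frame, codeword_index) tuples."""
    segments = []
    for t, frame in enumerate(frames):
        label = nearest_codeword(frame, codebook)
        if segments and segments[-1][2] == label:
            segments[-1] = (segments[-1][0], t, label)
        else:
            segments.append((t, t, label))
    return segments


# Hypothetical toy stream: two feature dimensions, three stable stretches.
frames = [(0.0, 1.0)] * 5 + [(1.0, 0.0)] * 4 + [(0.0, 1.0)] * 3
codebook = train_codebook(frames, num_codewords=2)
segments = segment(frames, codebook)
print(segments)
```

A vector quantizer labels every frame independently, so it has no notion of temporal continuity. The Hidden Markov Models mentioned in the abstract add transition probabilities between states, which is one reason to compare them against this baseline.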

Similar resources

On-line Visualization of Speech Organs Using MRI: A 3D Approach to Speech Articulation Modeling

A three-dimensional on-line visualization of the vocal tract during speech production was performed based on MRI data obtained from a female speaker producing the six Russian vowels. These images were collected using an original method of 3D MRI scanning in which the starting moments of the MRI processes enabled cooperative activity on the patient’s side via a special remote-control device. A strobosc...

Dual stream speech recognition using articulatory syllable models

Recent theoretical developments in neuroscience suggest that sublexical speech processing occurs via two parallel processing pathways. According to this Dual Stream Model of Speech Processing, speech is processed both as sequences of speech sounds and as sequences of articulations. We attempt to revise the “beads-on-a-string” paradigm of Hidden Markov Models in Automatic Speech Recognition (ASR) by implementing...

Multi-stream language identification using data-driven dependency selection

The most widespread approach to automatic language identification in the past has been the statistical modeling of phone sequences extracted from speech signals. Recently, we have developed an alternative approach to LID based on n-gram modeling of parallel streams of articulatory features, which was shown to have advantages over phone-based systems on short test signals whereas the latter achi...

A Multimodal Real-Time MRI Articulatory Corpus for Speech Research

We present MRI-TIMIT: a large-scale database of synchronized audio and real-time magnetic resonance imaging (rtMRI) data for speech research. The database currently consists of speech data acquired from two male and two female speakers of American English. Subjects’ upper airways were imaged in the midsagittal plane while reading the same 460 sentence corpus used in the MOCHA-TIMIT corpus [1]. ...

Speech Recognition Based on Syllable and Pseudo-articulatory Features

The prevailing approach to speech recognition is the statistical technique known as hidden Markov modeling (HMM), which is capable of reasonable performance in general usage (~95%) – but not much more. The major drawback is that it ignores phonetics, which has the potential for going beyond the acoustic variations to provide a more abstract underlying representation. Also, HMM only produces a s...

Publication date: 2010
